This is a walkthrough demonstrating how to generate SWNE plots alongside the Seurat manifold alignment pipeline from four pancreas datasets generated using different single cell RNA-seq technologies.
To save time we will be using the pre-computed Seurat object pancreas_integrated_seurat.Robj, which can be downloaded here.
First let’s load the required libraries
library(Seurat)
library(swne)
Let’s load the Seurat object
se.obj <- readRDS("~/swne/Data/pancreas_integrated_seurat.Robj")
We can “reconstruct” the gene expression matrix from the aligned correlated components and the corresponding gene loadings. This is not equivalent to the original gene expression matrix, as it only contains the reduced information from the correlated components, but it is batch-corrected and something we can run NMF on.
cca.aligned.expr <- ExtractDebatchedSeurat(se.obj)
Once we have the “reconstructed matrix”, we can run NMF and embed the components
k <- 20
n.cores <- 8
nmf.res <- RunNMF(cca.aligned.expr, k = k, alpha = 0, init = "ica", n.cores = n.cores)
swne.embedding <- EmbedSWNE(nmf.res$H, se.obj@snn, alpha.exp = 1.5, snn.exp = 1,
n_pull = 3, proj.method = "umap", dist.use = "cosine")
We project the full gene expression matrix to get gene loadings for the full set of genes
norm.counts <- ExtractNormCounts(se.obj, rescale = T, rescale.method = "log", batch = NULL)
## calculating variance fit ... using gam
nmf.res$W <- ProjectFeatures(norm.counts, nmf.res$H, n.cores = n.cores)
We use these gene loadings to identify key genes to embed, and we hide all the factors
gene.factor.df <- SummarizeAssocFeatures(nmf.res$W, features.return = 1)
genes.embed <- unique(gene.factor.df$feature)
swne.embedding <- EmbedFeatures(swne.embedding, nmf.res$W, genes.embed, n_pull = 3)
swne.embedding$H.coords$name <- ""
We pull out the clusters and batches
clusters <- se.obj@ident; names(clusters) <- se.obj@cell.names;
batch <- factor(se.obj@meta.data$tech); names(batch) <- se.obj@cell.names
We can then create the SWNE plot
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = clusters, do.label = T,
label.size = 3.5, pt.size = 1.25, show.legend = F, seed = 42)
We also can show that there are no batch effects
PlotSWNE(swne.embedding, alpha.plot = 0.4, sample.groups = batch, do.label = F,
label.size = 3.5, pt.size = 0.5, show.legend = T, seed = 42)
t-SNE plot for comparison
tsne.emb <- GetCellEmbeddings(se.obj, "tsne")
PlotDims(tsne.emb, sample.groups = clusters, show.legend = F, seed = 42)